Search for: All records
Total Resources: 4
Free, publicly-accessible full text available December 21, 2025
-
Perera, Niranda; Sarker, Arup Kumar; Shan, Kaiying; Fetea, Alex; Kamburugamuve, Supun; Kanewala, Thejaka Amila; Widanage, Chathura; Staylor, Mills; Zhong, Tianle; Abeykoon, Vibhatha; et al. (Frontiers in High Performance Computing)
The data engineering and data science communities have embraced Python and R dataframes for everyday applications. Driven by the big data revolution and artificial intelligence, these frameworks are now ever more important for processing terabytes of data. They can easily exceed the capabilities of a single machine, but also demand significant developer time and effort due to their convenience and ability to manipulate data with high-level abstractions that can be optimized. It is therefore essential to design scalable dataframe solutions. There have been multiple efforts to tackle this problem, the most notable being the dataframe systems built on distributed computing environments such as Dask and Ray. Even though the distributed computing features of Dask and Ray look very promising, we perceive that Dask Dataframes and Ray Datasets still have room for optimization. In this paper, we present CylonFlow, an alternative distributed dataframe execution methodology that enables state-of-the-art performance and scalability on the same Dask and Ray infrastructure (supercharging them!). To achieve this, we integrate Cylon, a high-performance dataframe system originally based on an entirely different execution paradigm, into Dask and Ray. Our experiments show that on a pipeline of dataframe operators, CylonFlow achieves 30× better distributed performance than Dask Dataframes. Interestingly, it also delivers superior sequential performance by leveraging the native C++ execution of Cylon. We believe the performance of Cylon in conjunction with CylonFlow extends beyond the data engineering domain and can be used to consolidate the high-performance computing and distributed computing ecosystems.
-
Perera, Niranda; Sarker, Arup Kumar; Staylor, Mills; von Laszewski, Gregor; Shan, Kaiying; Kamburugamuve, Supun; Widanage, Chathura; Abeykoon, Vibhatha; Kanewela, Thejaka Amila; Fox, Geoffrey (Future Generation Computer Systems)
-
Sarker, Arup Kumar; Lin, Felix Xiaozhu (HotMobile 2022)
